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ABSTRACT 

-We' discuss  interior-point  methods  for  linear  programming  derived  bf  applying  a  logarithmic 
barrier  transformation  and  performing  projected  Newton  steps  for  a  sequence  of  barrier  parame¬ 
ters.  Under  certain  conditions,  one  of  these'*projected  Newton  bamer*methods  is  shown  to  be 
equivalent  to  Karmarkar’s  (1984)  projective  method  for  linear  programming. 

Details  are  given  of  a  specific  barrier  algorithm  and  its  practical  implementation.  Numerical 
results  are  given  for  several  nontrivial  test  problems. 
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A  Projected  Newton  Bonier  Method 

1.  Introduction 

Interest  in  linear  programming  methods  arises  in  at  least  two  different  contexts:  theoretical  and 
practical.  The  long-established  simplex  method,  developed  by  G.  B.  Dantsig  in  the  late  1040’s, 
has  been  known  from  the  beginning  to  be  of  combinatorial  complexity  in  the  wont  case.  However, 
in  practice  it  tends  to  require  a  number  of  iterations  that  is  approximately  linear  in  the  problem 
dimension.  Furthermore,  a  typical  iteration  of  the  simplex  method  involves  relatively  little  work. 
After  computing  an  initial  factorization  of  a  square  matrix  (the  basis),  each  subsequent  simplex 
iteration  requires  only  a  rank-one  update  of  this  square  matrix.  Although  linear  programs  are 
often  very  large,  the  constraint  matrix  is  normally  very  sparse.  Sparse-mat rix  techniques  have 
developed  to  the  point  where  the  factorization  and  updates  required  in  the  simplex  method  can  be 
performed  not  only  rapidly,  but  also  with  assured  numerical  stability  (see  the  survey  by  Gill  et  a J., 
1984).  From  a  practical  viewpoint,  these  two  features  —  a  typically  linear  number  of  iterations, 
and  fast  methods  for  performing  each  iteration  —  imply  that  the  simplex  method  is  an  effective 
and  reliable  algorithm  for  linear  programming,  despite  its  seemingly  unfavorable  complexity. 

Many  researchers,  beginning  with  Dantzig  himself,  have  observed  the  seemingly  unsatisfac¬ 
tory  feature  that  the  simplex  method  traverses  the  boundary  of  the  feasible  region.  From  the 
outset,  attempts  have  been  made  to  develop  practical  linear  programming  methods  that  cross 
the  interior  of  the  feasible  region  —  for  example,  von  Neumann  (1947),  Hoffman  et  a I.  (19S3), 
Tompkins  (1955, 1957)  and  Frisch  (1957).  Such  methods  have  sometimes  involved  the  application 
of  nonlinear  techniques  to  linear  programs.  However,  none  of  these  methods  has  previously  been 
claimed,  even  by  its  developers,  to  be  competitive  in  speed  with  the  simplex  method  for  general 
linear  programs. 

On  the  theoretical  side,  researchers  attempted  for  many  years  to  develop  a  linear  program¬ 
ming  algorithm  with  only  polynomial  complexity.  In  1979,  to  the  accompaniment  of  wide  pub¬ 
licity,  this  issue  was  resolved  when  Khachiyan  (1979)  presented  a  worst-case  polynomial-time 
method  based  on  a  nonlinear  geometry  of  shrinking  ellipsoids.  Although  initially  it  was  thought 
that  the  ellipsoid  methods  might  be  as  fast  in  practice  as  the  simplex  method,  these  hopes  have 
not  been  realized.  Broadly  speaking,  there  are  two  mqor  difficulties:  first,  the  number  of  itera¬ 
tions  tends  to  be  very  large;  second,  the  computation  associated  with  each  iteration  is  much  more 
costly  than  a  simplex  iteration  because  sp arse-matrix  techniques  are  not  applicable. 

Within  the  past  year,  interest  in  linear  programming  has  been  intensified  by  the  publication 
(Karmarkar,  1984)  and  discussion  of  a  linear  programming  algorithm  that  is  not  only  polynomial 
in  complexity,  but  also  is  claimed  to  be  much  faster  than  the  simplex  method  for  practical 
problems. 

In  Section  2,  we  first  examine  the  well  known  barrier-function  approach  to  solving  optimisa¬ 
tion  problems  with  inequality  constraints,  and  derive  a  representation  for  the  Newton  search 
direction  associated  with  the  subproblem.  In  Section  3,  we  show  a  formal  equivalence  between 
the  Newton  search  direction  and  the  direction  associated  with  Karmarkar’s  (1984)  algorithm. 
Section  4  describes  a  complete  interior-point  method  for  linear  programming  based  on  the  barrier 
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transformation,  and  Section  5  gives  some  numerical  results  obtained  with  a  preliminary  imple¬ 
mentation  of  that  method.  The  implications  of  these  results  and  directions  for  future  research 
are  discussed  in  Section  6. 

1.2.  Notation.  The  term  projective  method  will  denote  the  algorithm  given  by  Karmarfcar 
(1984)  for  the  linear  program  (3.1);  see  below.  The  term  barrier  method  will  often  be  used  as  an 
abbreviation  tor  projected  Newton  barrier  method.  The  vector  norm  Q  •  (}  will  always  denote  the 
Euclidean  norm  ||v||2  =  (w7*;)1/2. 

2.  A  Barrier-Function  Approach 

2.1.  Applying  a  barrier  transformation  to  a  linear  program.  Barrier-function  methods 
treat  inequality  constraints  by  creating  a  barrier  function,  which  is  a  combination  of  the  original 
objective  function  and  a  weighted  sum  of  functions  with  a  positive  singularity  at  the  constraint 
boundary.  (Many  barrier  functions  have  been  proposed;  we  consider  only  the  logarithmic  barrier 
function,  first  suggested  by  Frisch,  1955.)  As  the  weight  assigned  to  the  singularities  approaches 
zero,  the  minimum  of  the  barrier  function  approaches  the  minimum  of  the  original  constrained 
problem.  Barrier-function  methods  require  a  strictly  feasible  starting  point  for  each  minimisation, 
and  generate  a  sequence  of  strictly  feasible  iterates.  (The  definitive  work  on  barrier  functions 
remains  Fiacco  and  McCormick,  1968;  overviews  are  given  by  Fletcher,  1981,  and  Gill,  Murray 
and  Wright,  1981.) 

Consider  applying  a  barrier-function  method  to  the  following  linear  program: 

minimize  cTz 

•€»"  (2.1) 
subject  to  Ax  —  b,  x  >  0, 

where  A  is  an  m  x  n  matrix  with  m  <  n.  The  subproblem  to  be  solved  within  a  barrier-function 
method  is: 

H 

minimize  Fix)  =  eTx  -  p  >  In*.- 

•6»"  (M) 

subject  to  Ax  =  b, 

where  the  scalar  p  [ft  >  0)  is  known  as  the  barrier  parameter  and  is  specified  for  each  subproblem. 
The  equality  constraints  cannot  be  treated  by  a  barrier  transformation,  and  thus  are  handled 
directly.  If  z  (p)  is  the  solution  of  (2.2),  then  x  (p)  -*  x  as  p  -»  0,  where  x  is  a  solution  of  (2.1) 
(see,  e.g.,  Fiacco  and  McCormick,  1968).  Very  strong  order  relations  can  be  derived  concerning 
x  (p)  and  cTz  (p)  (see,  e.g.,  Mifflin,  1972a,  b;  Jittomtrum,  1978;  Jittomtrum  and  Osborne,  1978). 
In  particular,  when  (2.1)  is  nondegenerate, 

|l**(M)-as*||  =0(m)  (2.3a) 

for  sufficiently  small  p.  When  (2.1)  is  degenerate,  the  corresponding  relation  is 

II*  (m)  -  «*ll  -  0(V5i). 


(2.34) 
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2.2.  Solution  of  tho  subproblam.  Given  a  linearly  constrained  problem  of  the  form 


minimise  F(x)  subject  to  Ax  ■  4 ,  (2.4) 

a  standard  approach  is  to  use  a  feasible-point  descent  method  (see,  e.g.,  Gill,  Murray  and  Wright, 
1981).  The  current  iterate  x  always  satisfies  Ax  =  b ,  and  the  next  iterate  2  is  defined  as 

2  =  x  +  ap,  (2.5) 

where  p  is  an  n-vector  (the  search  direction),  and  a  is  a  positive  scalar  (the  ateplength).  The 
computation  of  p  and  a  must  ensure  that  At  =  6  and  F(t)  <  F(x). 

The  Newton  search  direction  associated  with  (2.4)  is  defined  as  the  step  to  the  minimum  of 
the  quadratic  approximation  to  F(x)  derived  from  the  local  Taylor  series,  subject  to  retaining 
feasibility.  Thus,  the  Newton  search  direction  is  the  solution  of  the  following  quadratic  program: 

mhjmjK  /Tp  +  \pTHp  (2.6) 

subject  to  ;4p  =  0, 

where  y  =  VF(x)  and  H  =  VJf(*).  If  v  is  the  vector  of  Lagrange  multipliers  for  the  constraints 
in  (2.6),  then  the  required  solution  satisfies  the  linear  system 

/ B  AT\/-p\  /y\ 


a  f)(7)-o 


Note  that  r  converges  to  the  Lagrange  multipliers  for  the  constraints  Ax  =  b  in  the  original 
problem  (2.4). 

2.8.  The  projected  Newton  search  direction.  When  F(x)  is  the  barrier  function  in  (2.2), 
its  derivatives  are 

g(z)  =  c-pD~ic  and  H(z)  =  pD~7, 

where 

D  =  diag  (xj),  y  =  1, . . . ,  n,  (2.8) 

and  e  =  (1, 1, . . . ,  l)r.  Note  that  y  and  H  are  well  defined  only  if  ay  /  0  for  all  j. 

Let  pa  (the  projected  Newton  barrier  direction)  denote  the  Newton  search  direction  defined 
by  (2.6)  when  F  is  the  barrier  function  of  (2.2).  The  associated  Lagrange  multipliers  will  be 
denoted  by  vB.  Since  H(x)  is  positive  definite  when  x  >  0,  pm  is  finite  and  unique,  and  is  a 
descent  direction  for  F(x),  i.e.,  ( c  -  pD~le)TpB  <  0. 

It  follows  from  (2.7)  that  pa  and  xm  satisfy  the  equation 
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Rewriting  (2.9)  in  terms  of  a  Tec  tor  ra  defined  by  Dra  =  -fipa,  we  see  that  ra  and  xa  satisfy 

U  rx::)-(\-) 

It  follows  that  xa  is  the  solution  and  ra  the  optimal  residual  of  the  following  linear  least-squares 
problem: 

minimise  ||Z?e  -  fie  -  DATr Q.  (2.11) 

The  projected  Newton  barrier  direction  is  then 

P.  =  -(1  /p)DrB.  (2.12) 

For  a  given  positive  fi,  Newton’s  method  will  eventually  reach  a  domain  in  which  unit  steps 
along  the  directions  pa  will  be  feasible.  The  iterates  can  then  be  expected  to  converge  quadrat- 
ically.  In  general,  the  smaller  fi,  the  smaller  the  attractive  domain.  The  algorithm  remains  well 
defined  as  fi  tends  to  zero,  and  the  limiting  case  can  be  safely  simulated  (in  practice)  by  using  a 
very  small  value  such  as  p  =  10"1®. 

Note  that  feasible-direction  methods  are  independent  of  the  scaling  of  the  search  direction, 
if  the  steplength  algorithm  is  chosen  appropriately.  We  could  therefore  define  the  barrier  search 
direction  to  be 

Pa  =  —Drm  (2.13) 

for  any  fi>  0.  The  ideal  Newton  step  would  then  be  a  »  1/p. 

The  barrier  search  direction  (2.13)  with  p  *  0  is  used  in  an  algorithm  proposed  by  Vanderbei, 
Meketon  and  Freedman  (1985).  We  see  that  such  an  algorithm  has  no  domain  of  quadratic 
convergence. 

2.4.  Upper  bounds.  The  barrier  transformation  and  the  associated  Newton  search  direction 
can  also  be  defined  for  linear  programs  with  upper  and  lower  bounds  on  the  variables,  of  the 
form: 

minimise  eTx 
•€»" 

subject  to  Ax  —  b ,  l  <  *  <  u. 

The  subproblem  analogous  to  (2.2)  is 

a  m 

minimise  eTx  -  p  £  ln(*/  -  fy)  -  p  £  ln(u/  -  */) 

*6*"  /si  jml 

subject  to  li>i. 

The  Hessian  of  the  associated  barrier  function  will  be  positive  definite  only  if  at  least  one  of  tj 
or  uy  is  finite  for  every  /.  hi  this  case,  the  least-squares  problem  analogous  to  (2.11)  is 
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Here,  the  matrices  D  and  D  are  defined  bj  D  =  diag(jy)  and  D  —  diag(£y),  where 
6i  =  !/(!/«*  +  !/*?)*  “d  Sj  =  ~  !/*/)> 

with  e,  xy  -  tj  and  ty  =  uy  -  xy .  Fbr  simplicity,  the  remainder  of  the  discussion  will  assume  that 
the  bounds  are  of  the  form  in  (2.1),  i.e.,  0  <  *y  <  oo,  except  for  the  artificial  variable  discussed 
in  Section  4.2. 


3.  Relationship  with  Karmarkar’s  Projective  Method 

In  this  section,  we  show  the  connection  between  the  barrier  and  projective  methods.  We  assume 
that  the  reader  is  familiar  with  the  projective  method;  a  good  description  is  given  in  Todd  and 
Burrell  (1985). 

S.l.  Summary  of  the  projective  method.  In  the  projective  method,  the  linear  program  is 
assumed  to  be  of  the  special  form 


minimize  cTz 
*6ft" 

subject  to  Cz  —  0,  eTx  =  1,  z  >  0. 
Let  zK  denote  a  solution  of  (3.1).  It  is  also  assumed  that 


(3.1) 


e 


r  * 
** 


0, 


(3.2) 


and  that  Ge  =  0.  (These  assumptions  can  always  be  assured  by  transforming  the  problem.) 
The  optimality  conditions  for  (3.1)  imply  that 


e  =  CT Xc  +  eA,  +  if, 


(3.3) 


where  if  is  the  Lagrange  multiplier  vector  for  the  bound  constraints  of  (3.1).  The  complementarity 
conditions  at  zK  imply  that 

Xjflj  =  °,  j  —  1,  ...,n,  (3.4) 

where  zy  denotes  the  j- th  component  of  zK .  Taking  the  inner  product  of  e  and  x*  and  using 
(3.2)-(3.4),  we  obtain  A«  eTxK  =  0.  Since  eTzK  =  1,  it  follows  that  A«  =  0. 

Any  strictly  positive  diagonal  matrix  D  defines  the  following  projective  transformations, 
which  relate  any  strictly  feasible  point  x  and  the  transformed  point  zf: 


x'  = 


eTD~lz 


D~lz, 


1 

eTW 


DJ. 


(3.5) 


In  the  projective  method,  given  an  iterate  x,  D  is  defined  as  diag(xj, . . . ,  x*).  (Note  that  D  is  the 
same  as  the  diagonal  matrix  (2.8)  associated  with  the  barrier  method,  and  that  De  =  x.)  The 
next  iterate  in  the  transformed  space  is  given  by 
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where  rK  =  Dc  -  DC Tw„  -  ^e  is  the  optimal  residual  of  the  linear  least-squares  problem 


minimise  jj  De  —  ( DCT 


(3.7) 


The  steplength  a'  in  (3.6)  is  chosen  to  ensure  strict  feasibility  of  2*  as  well  as  descent  in  the 
transformed  “potential  function9  (see  Karmarkar,  1984,  for  details). 

The  new  iterate  tK  in  the  original  parameter  space  is  obtained  by  applying  the  transformation 
(3.5)  to  if,  so  that 


= 


*  cTD(x'-a'rK) 
where  7  is  chosen  to  make  eTtK  =  1. 


Z>(*'  -  a'rK)  =  7(*  -  dDrK), 


3.2.  Properties  of  the  projective  method.  Let  rc  be  defined  as  the  solution  of  the  least- 
squares  problem 

minimise  || Dc  —  DCtm |), 


and  let 

tlc-e-  CTwc, 
Me  =  *TDtic. 


For  reference,  we  state  some  properties  of  various  quantities  appearing  in  the  projective  method. 
Lemma  3.1.  The  solution  of  (3.7)  is  wK  —  wc  and  4  —  cTx/n. 

Lemma  3.2.  In  the  projective  method, 


rK  =  Dric  -  #e, 

d  =  no*, 

7  =  1/(1  +  d(4-  pc)), 
and  the  new  iterate  may  be  written  as 

Pm  =  Me*  ~  D9ric, 

*k  =*  +  d7p*. 

3.3.  Balationship  with  the  barrier  search  direction.  When  the  barrier  method  is  applied 
to  problem  (3.1),  we  obtain  va  and  #  as  the  solution  of  the  least-squares  problem 

minimise  \\De  -  pe  -  D{  CT  (3.8) 

and  take 

ra  =  Dc  -  pe  -  DCtwb  -  §De, 

Pa  =  “(1  Im)Dtb, 

*a  =  *  +  <*»., 


■,l  I^A1-'-J  »_U.' 


TSv*3vTT 


•S.TJ.'J. 1 


■JL'Jl’J.'yjl’J.’JL'JJ 
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for  some  p  and  a.  We  assume  that  the  matrices  (  DCT  t )  and  (  CT  e )  have  full  rank,  so  that 
the  solutions  of  (3.7)  and  (3.8)  are  unique. 

Lemma  S.S.  If  the  barrier  parameter  ia  chosen  to  be  ft  =  pc,  then  8  =  0  and  wB  =  rc. 

Lemma  3.4.  If  p  =  ft c,  then 

ra  =  Dflc  -  Mc«. 

Pa  =  (l/Mc)(Mc*  -  D7rtc), 

SB  =  x  +  apB. 


Comparing  Lemmas  3.2  and  3.4,  the  main  result  now  follows. 

Theorem  3.1.  Suppose  the  projective  method  and  the  barrier  method  are  applied  to  problem 
(3.1).  If  the  barrier  parameter  ia  ft  =  ft c,  the  search  directions  pm  and  pK  are  parallel.  Farther, 
if  the  steplengths  satisfy  a  =  a'jPc,  the  iterates  XB  and  fB  are  identical. 

Theorem  3.1  is  an  existence  result,  showing  that  a  special  case  of  the  barrier  method  would  follow 
the  same  path  as  the  projective  method.  This  does  not  mean  that  the  barrier  method  should 
be  specialized.  For  example,  the  value  ft  =  pc  is  admissible  in  the  barrier  method  only  if  i*  is 
positive.  As  it  happens,  pc  tends  to  zero  and  is  likely  to  be  positive  in  a  neighborhood  of  the 
solution.  It  could  therefore  be  a  satisfactory  choice  as  the  solution  is  approached. 

Similarly,  whatever  the  choice  of  p,  as  the  barrier  method  converges  to  a  solution  of  the 
original  problem,  0  must  converge  to  A,,  which  is  zero.  This  is  consistent  with  the  choice  p  =  pc, 
which  gives  0  =  0  directly. 
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4.  A  Projected  Newton  Barrier  Algorithm 

hi  this  section,  we  give  details  of  the  barrier  algorithm  used  to  obtain  the  numerical  results 
of  Section  5.  Each  iteration  is  of  the  form  X  =  x  +  ap  (2.5),  where  the  search  direction  is 
the  barrier  direction  pB  defined  by  (2.11)— (2.12),  and  the  barrier  parameter  may  be  changed  at 
every  iteration.  Any  tolerances  described  are  intended  to  be  suitable  for  machines  whose  relative 
precision  is  about  10~1S. 

Alternative  approaches  to  certain  parts  of  the  algorithm  are  discussed  in  Section  6. 

4.1.  The  main  steps.  At  the  start  of  each  iteration,  the  quantities  ft,  x,  w  and  if  are  known, 
where  ft  >  0,  x  >  0,  Ax  =  b,  and  if  —  c  —  ATr.  For  computational  reasons  we  compute  a  correction 
to  rc  at  each  stage,  since  a  good  estimate  is  available  from  the  previous  iteration.  The  main  steps 
of  the  iteration  then  take  the  following  form. 

1.  Define  D  =  diag(zy)  and  compute  r  =  Drf  -  fte. 

2.  Terminate  if  ft  and  ||r||  are  sufficiently  small. 

3.  If  appropriate,  reduce  ft  and  recompute  r  =  Drf  —  fte. 

4.  Solve  the  least-squares  problem 

minimise  ||r  -  DArtor|).  (4.1) 

5.  Update  *•«-*■  +  Sr,  t|*-if  —  AT5r,  and  set  r  =  Drf  -  fte,  p  —  — (l/|i  )Dr. 

6.  Find  aM,  the  maximum  step  a  such  that  z  +  op  >  0. 

7.  Determine  a  steplength  a  €  (0,  a«)  at  which  the  barrier  function  F(x  +  ap)  is  suitably  less 
than  F(z). 

8.  Update  z  «-  z  +  ap. 

All  iterates  satisfy  Ax  =  6,  z  >  0,  and  the  vectors  w  and  if  approximate  the  dual  variables  and 
reduced  costs  for  the  original  linear  program. 

The  vector  r  serves  two  purposes.  In  step  4,  r  is  the  right-hand  side  for  the  least-squares 
problem,  and  in  step  5  it  becomes  the  residual  vector  at  the  least-squares  solution.  In  either 
case,  r  — »  0  as  convergence  occurs,  so  that  Dtf  — *  fte.  Hence,  the  reduced-cost  estimates  should 
ultimately  satisfy  if  >  0,  as  one  might  expect. 

4.2.  The  feasibility  phase.  In  order  to  apply  the  barrier  algorithm  to  (2.1),  a  strictly  feasible 
starting  point  is  necessary.  Such  a  point  may  be  found  by  the  following  "phase  1*  procedure 
in  which  a  barrier  method  is  applied  to  a  modified  linear  program.  For  any  given  initial  point 
zo  >  0,  we  define  £o*  =  b  —  Ax o  with  ||«||  =  1,  and  solve  the  modified  linear  program 

minimize  £ 

«,( 


subject  to  (A  * )  ^  *  ^  =  5,  *  >  0,  £  >  0, 


'  •  •  »  •  •  *  i  •  *  *  »  <  „  • 
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using  the  feasible  starting  point  xq  >  0,  £o  =  ||6  —  Ax o||.  (Note  that,  even  if  A  is  sparse,  the 
additional  column  »  in  (4.2)  will  in  general  be  dense.)  In  our  experiments  we  have  used  zo  =  ||6||e. 

When  £  =  0,  a  suitable  point  has  been  found.  Since  the  barrier  transformation  will  not 
allow  £  to  reach  the  desired  value  of  zero,  £  must  be  treated  differently  from  the  other  variables 
in  solving  (4.1)  with  a  barrier  algorithm.  In  our  implementation,  the  search  direction  p  and  the 
maximum  step  a  are  computed  as  if  the  variable  £  were  subject  to  the  bound  £  >  —1.  If  the  step 
a  causes  £  to  become  negative,  an  appropriate  shorter  step  is  taken  and  phase  1  is  terminated. 
The  original  linear  program  is  presumed  to  be  infeasible  if  the  final  £  is  positive  for  a  sufficiently 
small  value  of  /*. 

As  an  alternative,  we  note  that  the  convergence  of  the  barrier  method  appears  to  be  moder¬ 
ately  insensitive  to  the  choice  of  linear  objective  function.  This  suggests  a  single-phase  algorithm 
in  which  an  objective  function  of  the  form  w  eTx  +  £  is  used  in  (4.2),  for  some  positive  value  of 
the  scalar  <j.  When  £  reaches  zero,  it  can  thereafter  be  excluded  from  the  problem.  If  a  single 
value  of  u)  can  be  retained  at  every  iteration,  only  a  slight  change  in  the  definition  of  the  linear 
program  is  required  after  a  feasible  point  is  found.  Some  preliminary  results  with  u  fixed  at 
0.01/||c||  seem  promising;  see  Section  5.  In  general,  a  sequence  of  decreasing  values  of  u  may  be 
needed  to  ensure  that  a  feasible  point  is  always  obtained  if  one  exists. 

4.S.  Solution  of  the  least-squares  subproblems.  For  problems  of  even  moderate  size,  the 
time  required  to  perform  an  iteration  will  be  dominated  by  solution  of  the  least-squares  problem 
(4.1).  The  widespread  interest  in  interior-point  methods  has  arisen  because  of  their  reported  speed 
on  large-scale  linear  programs.  Consequently,  problem  (4.1)  must  be  solved  when  A  is  large  and 
sparse.  Fortunately,  methods  for  sparse  least-squares  problems  have  improved  dramatically  in 
the  past  decade.  (For  a  recent  survey,  see  Heath,  1984.) 

An  obvious  approach  to  minimizing  ||r  —  DAT6w ||  is  to  solve  the  associated  normal  equations 

AD7AT6r  =  ADr  (4.3) 

using  the  Cholesky  factorization  AD7AT  =  RTR  with  R  upper  triangular.  Reliable  software 
exists  for  factorizing  symmetric  definite  systems,  notably  SPARSPAK-A  (George  and  Liu,  1981; 
George  and  Ng,  1984).  If  the  original  LP  (2.1)  is  non-degenerate,  the  matrix  AD7AT  will  be  non¬ 
singular  even  at  the  solution.  However,  for  a  degenerate  problem,  AD7AT  becomes  increasingly 
ill-conditioned  as  the  solution  is  approached,  and  the  accuracy  of  the  computed  version  of  AD7AT 
correspondingly  deteriorates.  Furthermore,  any  dense  columns  in  A  (such  as  « in  phase  1)  degrade 
the  sparsity  of  R. 

To  alleviate  these  difficulties,  we  have  used  a  “hybrid*  method  in  which  the  least-squares 
problems  are  solved  by  a  conjugate- gradient  method  ( LSQR ;  Paige  and  Saunders,  1982)  with  a 
triangular  preconditioner  R.  Thus,  an  iterative  method  is  applied  to 
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and  the  correction  Sx  is  recovered  by  solving  RSx  =  x. 

The  preconditioner  R  comes  from  the  Cholesky  factorisation  of  a  sparse  matrix  that  approx¬ 
imates  AD2AT.  Thus, 

AD2At  «  AD2AT  =  RtR,  (4.5) 

where  A  and  R  are  obtained  as  follows. 

1.  Before  beginning  the  barrier  algorithm,  a  preliminary  row  permutation  is  obtained  from  the 
symbolic  factorization  of  a  matrix  AAT,  where  A  is  A  with  certain  columns  replaced  by  zero. 
For  the  results  of  Section  5,  we  excluded  the  artificial  vector  $  in  (4.2)  and  any  columns 
of  A  containing  50  or  more  nonzeros.  Subroutines  Genqmd  and  Smhfet  of  George  and  Liu 
(1981)  were  then  used  to  obtain  a  minimum-degree  ordering  P  and  to  set  up  appropriate 
data  structures  for  the  subsequent  numerical  factorizations. 

2.  At  each  iteration  of  the  barrier  algorithm,  further  columns  and/or  rows  of  A  may  be  replaced 
by  zero:  columns  for  which  xy  <  10~6,  and  rows  that  have  been  marked  for  exclusion  during 
earlier  iterations.  Subroutine  Gtfct  of  George  and  Liu  (1981)  is  then  used  to  obtain  the 
Cholesky  factorization 


PAD2ATPT  —  11*11,  V  upper  triangular, 

with  the  proviso  that  if  a  diagonal  element  of  U  satisfies  u3f  <  10~13,  then  the  t-th  row  of 
U  is  replaced  by  ej,  and  the  t-th  row  of  PA  is  marked  for  exclusion  in  later  iterations.  The 
preconditioner  for  (4.4)  is  then  R  =  UP. 

3.  After  each  iteration,  any  variables  satisfying  zy  <  10“ 10  are  changed  to  zero  for  the  remaining 
iterations.  This  (conservative)  test  is  unlikely  to  remove  the  “wrong”  variables  from  the 
problem,  but  it  allows  some  economy  in  computing  R  and  solving  the  least-squares  problems. 

The  performance  of  LSQR  is  strongly  affected  by  the  quality  of  the  preconditioner,  and  by 
the  specified  convergence  tolerance  ATOL  (see  Paige  and  Saunders,  1982).  With  the  present 
implementation,  we  have  AD2AT  =  RTR  +  Ei  +  E3,  where  E\  has  low  rank  and  ||£a||  is  small; 
the  value  of  ATOL  is  taken  as  10-12.  In  this  situation,  LSQR  typically  requires  only  one  or  two 
iterations  to  achieve  acceptable  accuracy  in  phase  2,  and  only  two  or  three  iterations  in  phase  1. 

There  is  scope  in  future  work  for  degrading  the  approximation  (4.5)  to  obtain  a  sparser  R 
more  quickly,  at  the  expense  of  further  iterations  in  LSQR.  hi  fact,  Gay  (1985)  has  reported 
considerable  success  in  the  analogous  task  of  preconditioning  the  symmetric  conjugate- gradient 
method  in  order  to  solve  the  normal  equations  (4.3).  We  discuss  this  further  in  Section  6.1. 

4.4.  Determination  of  the  steplength.  The  steplength  a  in  (2.5)  is  intended  to  ensure  a 
reduction  in  the  barrier  function  F(x)  in  (2.2)  at  every  iteration.  Let  /(a)  denote  F(x  +  op), 
treated  as  a  function  of  a,  and  let  aM  be  the  largest  positive  feasible  step  along  p.  When 
P  —  Pm »  /*(0)  <  0;  by  construction  of  a  positive  singularity  at  the  boundary  of  the  feasible 
region,  /*(aM)  =  +oo.  Thus,  there  must  exist  a  point  a  in  the  interval  (0,aM)  such  that 
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f(a  )  =  0.  Because  of  the  special  form  of  /,  a  is  unique  and  b  the  univariate  minimiser  of  /(a) 
for  a  €  [0,aM|. 

In  our  algorithm,  a  is  an  approximation  to  a  sero  of  the  function  /*(<*).  1°  order  to  obtain  a 
“sufficient  decrease”  in  F  (in  the  sense  of  Ortega  and  Rheinboldt,  1970),  an  acceptable  a  b  any 
member  of  the  set 

r  =  {a  :  |/»l  <  -mo)}, 

where  /?  b  a  number  satisfying  0  <  <  1.  (The  smaller  the  value  of  /?,  the  closer  the  approxi¬ 

mation  of  a  to  a  sero  of  /*.) 

The  computation  of  an  acceptable  steplength  involves  an  iterative  procedure  for  finding 
a  sero  of  /'.  Many  efficient  algorithms  have  been  developed  for  finding  the  sero  of  a  general 
univariate  function  (see,  e.g.,  Brent,  1973),  based  on  iterative  approximation  by  &  low-order 
polynomial.  However,  such  methods  tend  to  perform  poorly  in  the  presence  of  singularities. 
In  order  to  overcome  this  difficulty,  special  steplength  algorithms  have  been  devised  for  the 
logarithmic  barrier  function  (e.g.,  Fletcher  and  McCann,  1969;  Murray  and  Wright,  1976).  These 
special  procedures  are  based  on  approximating  /(a)  by  a  function  with  a  similar  singularity. 

Given  an  interval  I  such  that  a  €  /  and  T  C  /,  a  new  interval  I  (/  C  I)  b  generated  using 
a4,  the  sero  of  a  simple  monotonic  function  ^(a)  that  approximates  f( a).  Let  aa  €  /  be  the 
current  best  estimate  of  a  .  Define  the  function  ^(a)  to  be 

+(<*)  -  It  +  “““  * 

-  a 

where  the  coefficients  71  and  73  are  chosen  such  that  j(am)  —  f(aB)  and  4'ia»)  —  /"(<*«)•  The 
new  estimate  of  the  sero  of  /'(a)  is  then  given  by 

a*  =  av  +  73/71- 

Using  this  prescription,  a  sequence  of  intervals  {//}  b  generated  such  that  Io  =  [0,  aM], 
Ij  C  Ij—i  and  r  C  Jy.  (For  additional  detaib,  see  Murray  and  Wright,  1976.)  The  first  point  a4 
that  lies  in  T  is  taken  as  a. 

In  practice,  we  find  that  a  close  approximation  to  the  minimum  of  F(x+ap)  can  be  obtained 
after  very  few  estimates  a,  (typically  1,  2  or  3).  However,  the  minimum  b  usually  vei7  close  to 
a w.  Thus,  if  an  accurate  search  b  performed,  at  least  one  variable  will  become  very  near  to  its 
bound.  Sometimes  this  may  be  beneficial,  but  in  phase  1  particularly,  the  danger  exbts  that  the 
optimal  value  of  that  variable  could  be  well  away  from  its  bound.  Although  convergence  b  still 
assured,  the  rate  of  convergence  may  temporarily  be  poor. 

To  guard  against  this,  we  set  0  =  0.999  and  use  0.9aM  as  an  initial  step,  which  b  normally 
accepted.  If  necessary,  we  compute  the  sequence  of  estimates  a*  as  described. 

4.5.  Choice  of  the  barrier  parameter.  In  a  "classical”  barrier-function  method  (e.g.,  as 
described  in  Fiacco  and  McCormick,  1968),  the  usual  procedure  b  to  choose  an  initial  value  of  j», 
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solve  the  subproblem  (2.2),  and  then  decrease  p  (say,  by  multiplying  by  a  constant).  In  order  for 
x  (p)  to  converge  to  x  ,  it  is  essential  that  p  -»  0.  If  the  barrier  search  direction  and  a  steplength 
as  defined  in  Section  4.4  are  used  to  solve  (2.2)  with  a  fixed  p,  standard  proofs  for  descent  methods 
(see,  e.g.,  Ortega  and  Rheinboldt,  1970)  can  be  applied  to  guarantee  convergence  to  x  (/*).  When 
(2.1)  is  non-degenerate  and  p  is  ‘sufficiently  small”  (say,  p  —  it  follows  from  (2.3a)  that 
the  final  iterate  of  the  barrier  method  will  approximate  the  solution  of  the  linear  program  (2.1)  to 
within  the  accuracy  specified  by  pm\a  for  a  non-degenerate  problem.  If  the  problem  is  degenerate, 
(2.36)  implies  that  the  solution  will  be  less  accurate. 

Various  strategies  for  changing  p  can  be  devised.  The  main  aim  is  to  reduce  p  as  quickly 
as  possible,  subject  to  ensuring  steady  progress  toward  the  solution.  For  example,  only  a  single 
step  of  Newton’s  method  could  be  performed  for  each  of  a  decreasing  sequence  of  p- values. 
Alternatively,  each  value  of  p  could  be  retained  until  the  new  iterate  satisfies  some  convergence 
criterion  for  the  subproblem. 

The  vector  r  =  Drj  -  pc  =  D(g  -  ATr)  may  be  used  to  measure  convergence  for  the  current 
subproblem,  since  it  is  a  scaled  form  of  g  -  ATx  (the  reduced  gradient  for  the  subproblem),  which 
must  tend  to  zero  for  any  fixed  p.  The  size  of  ||r||  is  monitored  in  our  implementation,  and  the 
reduction  of  p  is  controlled  by  two  parameters  as  follows. 

1.  An  initial  “target  level”  for  ||r||  is  defined  to  be  r  =  ||fo||  *  RCFAC. 

2.  Whenever  ||r||  <  r,  the  barrier  parameter  is  reduced  to  p  *  MUFAC,  r  is  recomputed,  and  a 
new  target  level  is  defined  to  be  r  =  ||r||  *  RCFAC. 

The  parameters  RCFAC  and  MUFAC  should  lie  in  the  range  (0, 1)  to  be  meaningful.  For  example, 
the  values  RCFAC  =  0.99,  MUFAC  —  0.25  allow  a  moderate  reduction  in  p  almost  every  iteration, 
while  RCFAC  =  MUFAC  =  0.001  requests  more  discernible  progress  towards  optimality  for  each 
subproblem,  with  a  substantial  reduction  in  /t  on  rare  occasions. 

4.6.  Convergence  tests.  Three  other  parameters  are  needed  to  define  the  initial  and  final 
values  of  the  barrier  parameter,  and  the  degree  of  optimality  required  for  the  final  subproblem. 
In  an  effort  to  allow  for  poor  scaling  in  the  data,  we  initialize  p  to 

Po{l  +  |cT*o|)/(nln(l  +  llxoll)) 

for  some  pa,  and  whenever  a  newly  reduced  p  is  smaller  than 

+  |eTx|)/(nln(l  +  ||*||)) 

for  the  current  x,  pis  fixed  at  the  latter  value  for  the  remaining  iterations.  At  the  same  time, 
the  target  level  for  the  reduced  gradient  is  fixed  at 

*kta||*lllkll/» 


and  termination  occurs  when  ||r||  subsequently  falls  below  that  value. 
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In  our  experiments  we  have  used  po  =  0.1,  n-i-  *»  10~7,  and  r»i.  =>  10-7.  If  po  is  too  large 
a  danger  exists  on  problems  for  which  the  feasible  region  Ax  »  ht  s  >  0  is  unbounded;  since  the 
barrier  function  is  then  unbounded  below,  the  iterates  can  diverge  in  phase  1  before  the  artificial 
variable  (  reaches  zero. 

S.  Numerical  Results 

5.1.  Performance  of  the  barrier  method  on  a  standard  test  set.  hi  this  section  we 
summarise  the  performance  of  the  barrier  algorithm  described  in  Section  4  on  problems  from  an 
LP  test  set  in  rise  at  the  Systems  Optimization  Laboratory.  All  problems  are  in  the  form  (2.1). 
To  obtain  constraints  of  the  form  Ax  =  h,  any  general  inequality  constraints  are  converted  to 
equalities  using  slack  variables.  Details  of  the  problems  are  summarized  in  Table  1.  The  value 
of  “rows”  refers  to  the  number  of  general  constraints,  and  ‘columns*  to  the  number  of  variables, 
excluding  slacks.  The  number  ‘slacks*  is  defined  above.  The  column  M*  gives  the  number  of 
nonzeros  in  the  problem.  This  figure  includes  one  for  each  slack  but  excludes  the  nonzeros  in  1 
and  c. 

The  runs  summarized  in  Tables  2-6  were  made  in  double  precision  on  an  IBM  3081K  (relative 
precision  2.2  x  10~le).  The  source  code  was  compiled  with  the  IBM  Fortran  77  compiler  VS 
Fortran,  using  NOSDUMP,  NOSYM  and  0PT(3). 

Table  2  gives  the  number  of  iterations  and  CPU-seconds  required  by  the  the  primal  simplex 
method,  as  implemented  in  the  Fortran  code  MINOS  5.0  (May  1985).  The  default  values  of  the 
parameters  were  used  throughout  (see  Murtagh  and  Saunders,  1983).  Results  are  also  given  in  the 
case  where  the  constraints  are  scaled  by  an  iterative  procedure  that  makes  the  matrix  coefficients 
as  close  as  possible  to  one. 

Many  runs  were  made  incorporating  different  choices  for  the  parameters  8GFAC  and  NDFAC, 
which  specify  the  accuracy  of  a  given  subproblem  and  the  rate  at  which  the  barrier  parameter  is 
reduced.  One  aim  was  to  find  a  set  of  values  that  could  be  used  reliably  on  all  problems.  It  was 
found  that  RGFAC  =  0.1  and  MUFAC  =  0.1  gave  the  most  consistent  results.  Table  3  summarises 
the  performance  of  the  barrier  method  with  these  values.  The  second  and  third  columns  of  the 
table  give  the  number  of  iterations  to  obtain  a  feasible  point  and  the  total  iterations  requited 
to  satisfy  the  convergence  tests  of  Section  4.6.  The  fourth  column  gives  the  total  CPU  time  (in 
seconds)  to  solve  the  problem.  The  values  of  the  optimal  objective  function  found  by  MINOS  5.0 
were  used  to  judge  the  accuracy  of  the  final  objective  in  the  barrier  runs.  The  underlined  digits 
in  the  fifth  column  show  the  correct  figures  in  the  objective  function  on  termination.  The  final 
two  columns  indicate  the  degree  of  feasibility  and  optimality  of  the  final  point. 

Table  4  gives  the  results  of  applying  the  barrier  algorithm  with  the  same  scaling  procedure 
as  in  MINOS  5.0.  Note  that  scaling  alters  the  starting  point  ||i||e,  but  otherwise  its  effect  was 
substantial  only  on  Shareih. 

Table  5  illustrates  the  performance  of  a  single-phase  method  in  which  a  composite  objective 
function  of  the  form  weTx  +  (  was  used  throughout  (see  Section  4.2).  The  number  of  phase  1 
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iterations  is  sometimes  greater  than  that  for  w  *  0  (cf.  Table  4),  bat  the  total  number  of  iterations 
is  generally  less. 

Some  statistics  concerning  the  matrix  factorisations  used  in  MINOS  5.0  and  the  barrier 
method  are  provided  in  lhble  6.  As  in  Itble  1,  column  'A*  gives  the  number  of  nonseros  in 
the  problem.  The  columns  *fl*  and  *L  +  U*  give  the  number  of  nonseros  in  the  simplex  basis 
and  its  IAJ  factors  after  the  last  refactorisation  (this  is  typically  the  most  dense  factorisation  of 
all  those  performed).  Finally,  the  column  contains  the  number  of  nonseros  in  the  Cholesky 
factorisation  (4.S)  required  by  the  barrier  method. 


Thblo  1 

Problem  Statistics 


Problem 

Rows 

Slacks 

Columns 

A 

i«i 

1*1 

Afiro 

27 

19 

32 

102 

9.7  IS* 

3.9  it1 

ADLittle 

56 

41 

97 

424 

6.1  is* 

6.210* 

Share  Sb 

96 

83 

79 

777 

1.8  it* 

3.8io* 

Share  lb 

117 

28 

228 

1179 

1.3  it* 

7.7io* 

Beaconfd 

173 

33 

262 

3408 

1.6  is* 

1.2io* 

Israel 

174 

142 

2443 

9.1  is* 

5.6io* 

BrandY 

mm 

249 

1 

6.5  IS* 

8.710* 

BS26 

BS9 

282 

WSM 

9.6  It* 

4.1 10* 

BaniM 

El 

■Kfl 

472 

m 

1.5  is* 

3.0 10* 

Ifcble  1 

Results  from  the  primal  simplex  code  MINOS  5.0 


Optimal 

Objective 

Phase  1 

No  scaling 

Total 

Time 

Phase  1 

With  scaling 

Total 

Time 

Afiro 

-4.6475314 

2 

6 

0.5 

■1 

6 

0| 

ADLtttie 

28 

123 

1J 

SI 

98 

wm 

ShareBb 

-415.73224 

59 

91 

1.3 

74 

121 

1.4 

Share  lb 

■ 

135 

296 

3.4 

144 

260 

2.8 

Beaconfd 

8 

38 

1.9 

6 

39 

1.8 

Israel 

BJ  JJJ  '  • 

109 

345 

5.0 

41 

231 

3.7 

BrandY 

1518.5099 

176 

292 

4.9 

216 

377 

5.9 

Bite 

-18.751929 

109 

570 

9.4 

101 

471 

7.5 

BandM 

-158.62802 

167 

362 

7.6 

280 

534 

10.0 
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Problem 


Afiro 
AD  Little 
Share  Sb 
Shan  lb 
Beaconfd 
Israel 
BrandY 
B226 
BandM 


Tables 

No  scaling,  RCFAC  =  0.1,  MUFAC 


Phase  1  Total 


Time 

Objective 

0.4 

-464.75314 

1.1 

225434-96 

1.5 

-41573214 

4.9 

-26589.319 

9.9 

33592.486 

22.6 

-890644,82 

8.5 

1518,5099 

9.8 

-18.7-51929 

9.3 

-158.62802 

3.6 10- w 
8.8 


mm 


1.9 

2.4 

9.3  to-11 
4.5 10“ 11 

4.5  to"11 
9.4 10“ 12 
3.7  io-u 

1.4  io-10 

5.6  to”11 


Problem 


Afiro 

ADLittle 

Share  £b 

Shanlb 

Beaconfd 

Itrael 

BrandY 

E226 

BandM 


Table  4 

With  scaling,  RCFAC  =  0.1,  MUFAC 


Phase  1  Total 


Time 

Objective 

1144*4 

1*1 

0.4 

-404,75214 

6.3 10~12 

1.1 

225494-96 

6.0  io- 10 

1.6 

-415.73224 

4.9  io“# 

2.8 

-76589.319 

4.7 10“* 

11.1 

33592.486 

1.9  io~T 

24.1 

-896644.82 

1.6  IO"* 

9.2 

1518.5099 

1.3 10~7 

10.2 

-18.751929 

1.1 10“T 

9.8 

-158.62802 

1.7 10"* 

9.6 

2.2  IO"10 

5.4  io”u 

4.7  io” 12 

1.5  io"10 
5.0  io~12 

1.8  io-“ 
1.4  io”u 

4.3  io” 11 
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Tibi*  5 

Composite  objective  function 
With  selling,  BGFAC  =  0.1,  MUFAC  =  0.1,  u  =  0.01/||c|| 


Problem 

Time 

Objective 

Afiro 

4 

20 

0.4 

-464-75514 

9.3  io“10 

ADLittie 

17 

31 

1.0 

225494.96 

SUES 

9.7  io“u 

Sharetb 

8 

24 

1.6 

-415-75224 

4.4 10"10 

5.4 10"  u 

Share  lb 

7 

33 

2.8 

-76589.319 

4.4  10"* 

4.6 10"  “ 

Beaconfd 

24 

32 

9.6 

53592.486 

2.2 10-T 

2.5 10"u 

Israel 

11 

mm 

25.1 

-896644.82 

6.2l0“* 

4.8 10"u 

BrandY 

25 

8.1 

1518.5099 

6.7 10"T 

9.8  lo"1* 

B226 

21 

Bfl 

9.4 

-18,761929 

6.5  io“* 

1.5  io-“ 

BandM 

23 

35 

8.9 

-158.62802 

1.4  io“* 

3.2 10"11 

Tibi*  6 

Factorization  Statistics 


Problem 

A 

B 

L  +  U 

R 

Afiro 

102 

67 

67 

80 

ADLittie 

424 

261 

555 

Sharc2b 

777 

564 

697 

925 

Share  lb 

1179 

579 

1345 

Beaconfd 

1546 

1546 

2727 

Israel 

2443 

1644 

1664 

S533f 

BrandY 

2202 

1318 

1485 

3261 

B226 

2768 

1440 

n  _  i  t \^\ 

3416 

BandM 

2494 

2016 

2372 

4355 

fll259  if  six  dense  columns  are  included. 


5.3.  Obtaining  an  optimal  basic  solution.  By  its  very  nature,  a  barrier  method  can  at  best 
terminate  somewhere  "dose”  to  an  optimum.  We  must  then  ask:  how  close  is  ‘close’,  and  which 
of  the  several  characterisations  of  LP  optimality  are  we  close  to  achieving? 

In  practice,  LP  users  (and  their  report-writing  programs)  expect  alleged  optimal  solutions  to 
be  both  primal  and  dual  feasible,  thus  exhibiting  complementary  slackness.  The  last  column  in 
Tables  5-5  show  that  the  barrier  algorithm  can  attain  complementary  slackness  to  high  precision. 
However,  LP  users  also  expect  their  solutions  to  be  bask.  This  can  be  achieved  by  taking  the 
final  solution  from  the  barrier  algorithm  and  pro  resting  it  through  the  BA  SIC  procedure  common 


A  Projected  Newton  Barrier  Method 


17 


to  most  mathematical  programming  systems. 

The  BASIC  procedure  (sometimes  known  as  INSERT  by  value ;  see  Beniehou  et  a L,  1977) 
takes  a  set  of  variable  names  and  values  and  produces  a  basic  solution  that  has  at  least  as  good 
an  objective  value  or  sum  of  infeasibilities.  The  simplex  method  may  then  be  applied  to  reach 
optimality.  The  time  required  by  BASIC  and  the  simplex  method  together,  compared  to  the 
time  required  by  the  simplex  method  alone,  provides  another  practical  measure  of  closeness  to 
optimality. 

Some  experiments  of  this  kind  were  performed  to  test  the  quality  of  the  solutions  obtained  by 
the  barrier  algorithm.  An  early  implementation  of  the  barrier  algorithm  was  used,  with  a  slightly 
different  starting  point  for  phase  1  and  a  different  strategy  for  controlling  ft.  This  strategy 
initialised  n  to  10||e||/n  and  multiplied  /t  by  0.25  every  iteration  if  n  was  still  larger  than  10~*. 
The  algorithm  was  terminated  when  max/  |x/tf/|  <  10-7||e||||z||,  i.e.,  when  complementarity  was 
approximately  achieved. 

These  experiments  were  carried  out  on  an  IBM  3033N,  except  for  one  problem  DegenS  that 
was  run  on  an  IBM  3081K.  The  problems  used  were  an  available  subset  of  those  in  Table  1,  plus 
a  graduated  set  of  three  models  from  a  single  application,  ranging  from  small  to  medium  in  site 
(see  Table  7)  and  notable  for  their  severe  degeneracy. 

Initially,  all  problems  were  solved  from  scratch  by  Ketron’s  WHIZARD  optimiser,  which  was 
called  from  MPSIII  in  all  cases  except  for  DegenS,  where  the  host  was  MPSX/370.  The  first 
three  columns  of  Table  8  give  the  number  of  columns  selected  for  the  initial  basis  by  CRASH, 
the  number  of  simplex  iterations  required  to  reach  optimality,  and  the  CPU  time  in  seconds. 
(All  times  in  this  section  are  for  a  complete  MVS  job  step,  including  model  input  and  solution 
output.) 

The  next  three  columns  of  Table  8  give  results  for  the  barrier  algorithm.  For  the  smaller 
problems,  there  is  about  10  percent  overhead  for  input,  output,  symbolic  ordering  (subroutine 
Genqmd)  and  symbolic  factorization  (subroutine  Smbfet).  For  the  larger  problems,  the  cost  of 
the  single  call  to  Genqmd  became  significant:  2.4  seconds  for  BandM,  13.4  seconds  for  DegenS, 
and  237  seconds  for  DegenS.  Clearly  some  more  efficient  means  of  preprocessing  must  be  found 
for  large  problems. 

Table  7  compares  the  number  of  nonzeros  in  a  typical  WHIZARD  basis  factorization  with 
the  number  of  nonzeros  in  the  Cholesky  factors  R,  again  indicating  the  high  cost  of  solving  large 
least-squares  problems. 

The  last  two  columns  of  Table  8  show  the  work  required  by  BASIC  and  the  simplex  method 
to  reach  optimality,  starting  from  the  point  obtained  by  the  barrier  algorithm.  With  *  denoting 
the  basic  solution  obtained  by  BASIC,  the  quantities  (|t  -  Ax|]/|6|  were  observed  to  be  less  than 
10~*  in  all  cases,  and  the  values  of  |erx  -  eTz\l\eTz\  were  all  less  than  10~*.  The  number  of 
subsequent  simplex  iterations  appears  to  be  a  function  of  size  and  degeneracy,  the  two  being 
closely  related  in  this  sample.  Clearly,  the  primary  effect  of  the  post-BA 5/(7  iterations  is  to 
remove  dual  infeasibilities. 
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The  problems  BrandY,  BBSS  and  BandM  hare  23, 20  and  21  structurally  degenerate  variables 
respectively.  Ideally,  these  should  be  removed  by  the  PRBSOLVB  procedure  before  entering  the 
barrier  (or  simplex)  method.  They  can  be  restored  and  dual  feasibility  attained  at  trivial  cost  by 
the  POSTSOLVB  procedure  (see  Tomlin  and  Welch,  1983).  At  present  there  is  no  obvious  way 
of  circumventing  extreme  degeneracy  of  the  ordinary  type,  such  as  that  displayed  by  the  Degen 
problems.  However,  the  relative  number  of  post  -BASIC  simplex  iterations  required  for  these 
problems  appears  to  decline  with  problem  size,  compared  to  the  number  required  when  starting 
from  scratch.  It  may  well  be  that  non-simplex  methods  such  as  the  projective  and  barrier  methods 
will  prove  most  valuable  for  dealing  with  very  degenerate  LPs. 


Table  7 

Model  statistics  —  degenerate  problem  set 


Problem 

Rows 

Slacks 

Columns 

A 

B 

L  +  V 

R 

Degenl 

66 

15 

72 

296 

249 

251 

514 

DegenS 

444 

223 

534 

4894 

3076 

3718 

DegenS 

1503 

786 

1818 

25432 

18468 

119373 

Table  8 

Results  Cram  the  BASIC  procedure 


Crash 

Whisard 

Simplex 

Time 

Phase  1 

Barrier 

Total 

Time 

BASIC 

Simplex  Time 

Afiro 

0.5 

3 

14 

0.3 

vrm 

ADLittle 

8 

23 

1.2 

Shan  2b 

84 

Bfl 

6 

17 

1.8 

BrandY 

;1I> 

IBS 

12 

29 

10.8 

6 

Efl 

E226 

3.9 

15 

29 

13.0 

6 

19 

BandM 

3.5 

15 

28 

13.0 

1 

2.8 

Degenl 

38 

23 

0.7 

2 

15 

0.9 

0.7 

Degent 

187 

2650 

31.2 

13 

26 

54.9 

300 

7.3 

DegenS 

590 

8889 

226.0 

11 

25 

528.0 

593 

60.0 

6.  Future  Developments  and  Conclusions 

0.1.  Solving  the  least-squares  subproblem.  The  present  implementation,  as  in  Gay  (1985), 
uses  a  preconditioned  conjugate- gradien  t  method  to  solve  the  relevant  least-squares  subproblems. 
This  approach  allows  the  use  of  existing  software  for  computing  Cholesky  factors,  and  provides  a 
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convenient  way  of  dealing  with  a  few  dense  columns  of  A  that  would  degrade  the  sparsity  of  those 
factors.  Perhaps  further  efficiency  could  be  gained  by  discarding  small  nonzeros  in  the  product 
AD*At  (not  just  rows  and  columns  of  A)  before  computing  its  Cholesky  factors.  However,  a  new 
symbolic  factorization  would  then  be  required  at  every  stage,  not  just  once. 

Unfortunately,  it  seems  that  the  quality  of  the  preconditioner  must  remain  rather  high 
throughout,  since  an  iteration  of  LSQR  or  the  conjugate- gradient  method  requires  two  matrix- 
vector  products  involving  A  and  two  solves  with  the  preconditianer  R,  and  is  therefore  as  ex¬ 
pensive  (typically)  as  two  iterations  of  the  simplex  method.  If  the  average  number  of  iterations 
required  by  LSQR  were  20,  say,  then  the  barrier  algorithm  would  have  to  terminate  in  about 
l/50th  the  number  of  iterations  in  order  to  be  competitive  with  the  simplex  method,  even  if 
minimal  effort  were  required  to  obtain  R  each  iteration. 

To  illustrate,  the  test  problem  ferael  has  six  dense  columns  in  A.  With  these  excluded 
from  the  computation  of  R,  LSQR  required  an  average  of  12  iterations,  and  the  run  time  was 
correspondingly  high.  (On  the  other  hand,  retaining  all  columns  gave  an  R  with  three  times  as 
many  nonzeros  and  a  run-time  twice  as  great.) 

For  reasons  such  as  these,  we  remain  doubtful  that  good  preconditioners  can  be  computed 
with  adequate  efficiency  for  arbitrary  matrices  DAT,  i.e.,  for  arbitrary  linear  programs.  Only  for 
special  classes  of  problem  does  there  seem  to  be  room  for  optimism;  for  example,  those  exhibiting 
a  block-triangular  structure  with  many  email  diagonal  Mocks. 

In  place  of  the  iterative  methods  just  described,  one  can  employ  a  sparse  orthogonal  factor¬ 
isation  of  the  form 


DAt  =  Q  ^  ^  ,  QtQ  =*  /,  R  upper  triangular  (6.1) 

to  solve  the  least-squares  problems  directly,  where  R  is  analytically  the  same  as  the  Cholesky 
factor  of  AD7At.  General-purpose  software  exists  for  this  computation,  in  particular  SPARSPAK- 
B  (George  and  Ng,  1984),  which  has  excellent  numerical  properties  and  is  able  to  treat  dense 
rows  of  DAt  specially  in  order  to  preserve  the  sparsity  of  R.  Its  use  in  this  context  merits  future 
investigation. 

A  further  direct  approach  is  to  apply  a  sparse  indefinite  solver  to  the  symmetric  system  (2.10). 
The  MA£7  package  of  Duff  and  Reid  (1982)  is  applicable,  and  we  believe  it  shows  considerable 
promise.  As  with  the  sparse  QR  (6.1),  a  single  symbolic  factorization  would  serve  all  iterations. 
In  addition,  dense  columns  in  A  would  not  drastically  affect  the  sparsity  of  the  factors. 

6.3.  Adjusting  thu  barrier  parameter.  Numerous  authors  have  suggested  extrapolation 
techniques  in  connection  with  barrier  functions.  We  have  conducted  some  limited  experiments, 
as  follows.  The  barrier  function  was  minimized  reasonably  accurately  for  two  quite  large  values 
of  M  OMI/n  end  0.1||e||/n),  and  extrapolation  was  then  performed.  The  resulting  solution  was 
accurate  to  about  five  figures.  However,  it  is  difficult  to  evaluate  the  practical  merits  of  this 
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approach  without  further  study.  Recall  that  such  a  strategy  would  need  to  be  applied  to  both 
phases. 

A  number  of  suggestions  have  been  made  for  automating  the  choice  of  ft.  The  method  of 
centers  (see  Fiacco  and  McCormick,  1968;  Huard,  1967)  is  in  essence  a  barrier-function  method 
in  which  a  transformation  is  also  applied  to  the  objective  function,  thereby  negating  the  need  for 
a  barrier  parameter.  See  also  Todd  and  Burrell  (1985)  for  a  discussion  of  this  approach. 

It  should  be  stressed,  however,  that  the  freedom  to  choose  ft  may  well  be  an  asset,  especially 
in  the  case  of  linear  programs.  This  is  confirmed  by  our  experience  with  the  conservative  strategy 
of  allowing  j*  to  be  reduced  only  occasionally.  Considerable  progress  is  then  often  achieved  before 
the  least-squares  problems  become  unduly  ill-conditioned. 

6.8.  Use  of  th«  entropy  function.  Because  of  the  similarities,  we  note  the  work  of  many 
authors  on  incorporating  the  entropy  function  into  linear  programming  models,  hi  place  of 
subproblem  (2.2),  one  can  consider  the  subproblem 

m 

minimise  cTx  +  p  x,-  ha  x,- 
»e«"  .. 

subject  to  Ax  =  b, 

where  the  scalar  /*  (/*  >  0)  is  again  specified  for  each  subproblem.  Erlander  (1977)  reviews 
problems  of  this  kind  and  suggests  Newton-type  methods  for  their  solution.  Computational 
algorithms  have  been  developed  by  Eriksson  (1980,  1981). 

If  a  feasible-point  descent  method  is  applied  as  in  Section  2,  the  Newton  search  direction  and 
Lagrange-multiplier  estimates  satisfy  the  system 

r;‘  f)(r)-cr) 

in  place  of  (2.9),  where  D  —  diag(zy)  and  v  has  components  ey  *  1  +  ln*y.  A  least-squares 
subproblem  follows  as  before.  In  terms  of  the  algorithm  of  Section  4.1,  r  would  be  defined  as 
r  =  D*/2( n  +  nv)  in  steps  1,  3  and  5,  and  D  would  be  changed  to  Dxl%  in  the  least-squares 
problem  (4.1).  Finally,  we  would  have  p  =*  -(1  /it)Dlf*r  in  step  5. 

The  entropy  function  is  convex  and  (unlike  the  logarithmic  barrier  function)  it  is  bounded 
below.  Also,  since  its  Hessian  is  fiD~l  rather  than  pD~7,  the  least-squares  problems  should  be 
better  conditioned  as  the  LP  solution  is  approached.  Further  computational  work  therefore  seems 
to  be  justified,  either  as  in  Eriksson  (1980,  1981)  or  along  the  lines  suggested  here. 

6.4.  Conclusions.  Most  qualitative  aspects  of  Harm  ark  ar’s  projective  method  can  be  found  in 
the  projected  Newton  barrier  algorithm  described  here.  The  nonlinear  programming  viewpoint 
has  provided  us  with  an  armory  of  known  theoretical  and  practical  techniques,  to  be  applied 
to  convergence  analysis  and  computer  implementation.  Tb  date,  a  casualty  appears  to  be  the 
proof  of  polynomial  complexity.  While  this  may  seem  a  serious  loss,  the  number  of  iterations 


/•/'.'aVa 


A  Projected  Newton  Bonier  Method 


tl 


required  by  the  barrier  algorithm  appear*  to  be  similar  in  practice  to  that  reported  for  various 
projective  methods  (cf.  Tomlin,  1985,  and  Lustig,  1985),  and  our  implementation  has  proved  to 
be  competitive  with  the  simplex  method  on  some  of  a  limited  class  of  moderate-sised  problems 
(all  of  which  are  degenerate  and  poorly  sealed). 

Two  facts  remain: 

•  large-scale  least-squares  problems  can  be  very  expensive  to  solve  (compared  to  square  systems 
Bx  -  5); 

•  interior-point  methods  are  inescapably  minimising  a  highly  nonlinear  function. 

In  view  of  the  second  point,  poor  performance  can  be  expected  if  variables  migrate  prematurely 
towards  the  singularities  at  their  bounds.  It  has  been  difficult  to  develop  a  robust  algorithm  for 
this  reason.  The  same  difficulty  arises  if  a  ‘good*  starting  point  is  available  from  an  earlier  run 
on  a  similar  problem  (by  far  the  most  common  situation),  since  such  a  solution  will  have  many 
variables  on  or  close  to  their  bounds. 

From  the  computational  evidence  to  date,  the  future  for  interior-point  methods  seems  bright¬ 
est  in  the  context  of  cold-start  solution  of  very  large,  specially  structured  problems. 
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